Overview

Dataset statistics

Number of variables: 16
Number of observations: 20000
Missing cells: 8261
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 MiB
Average record size in memory: 128.0 B
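
The figures above can be cross-checked with a little arithmetic. The sketch below assumes that every column stores one 8-byte value per row, which is an assumption about the in-memory layout rather than something the report states.

```python
# Figures copied from the dataset statistics above.
n_variables = 16
n_observations = 20_000
missing_cells = 8_261

# Share of missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each column holds one 8-byte value per row.
bytes_per_record = n_variables * 8
total_mib = bytes_per_record * n_observations / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   {bytes_per_record} B")
print(f"total size:    {total_mib:.1f} MiB")
```

Under that assumption the three results agree with the reported 2.6%, 128.0 B and 2.4 MiB.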

Variable types

Numeric: 10
Categorical: 6

Warnings

name has a high cardinality: 19768 distinct values [High cardinality]
host_name has a high cardinality: 6517 distinct values [High cardinality]
neighbourhood has a high cardinality: 217 distinct values [High cardinality]
last_review has a high cardinality: 1507 distinct values [High cardinality]
last_review has 4123 (20.6%) missing values [Missing]
reviews_per_month has 4123 (20.6%) missing values [Missing]
minimum_nights is highly skewed (γ1 = 25.17996962) [Skewed]
name is uniformly distributed [Uniform]
id has unique values [Unique]
number_of_reviews has 4123 (20.6%) zeros [Zeros]
availability_365 has 7176 (35.9%) zeros [Zeros]
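
Warnings of this kind can be derived from per-column summaries with simple threshold rules. The sketch below is illustrative only: `ColumnSummary`, `warnings_for` and all four threshold constants are hypothetical names and values chosen to be consistent with the list above; they are not taken from the report or from the profiling library. The "Uniform" warning is not covered.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the report does not state the values it used.
CARDINALITY_THRESHOLD = 50   # distinct values in a categorical column
MISSING_FRACTION = 0.05      # share of missing values
ZEROS_FRACTION = 0.05        # share of zeros
SKEWNESS_THRESHOLD = 20.0    # absolute skewness


@dataclass
class ColumnSummary:
    name: str
    n: int
    n_distinct: int
    categorical: bool = False
    n_missing: int = 0
    n_zeros: int = 0
    skewness: float = 0.0


def warnings_for(col):
    """Return the warning labels that the rules above assign to one column."""
    flags = []
    if col.categorical and col.n_distinct > CARDINALITY_THRESHOLD:
        flags.append("High cardinality")
    if col.n_missing / col.n > MISSING_FRACTION:
        flags.append("Missing")
    if abs(col.skewness) > SKEWNESS_THRESHOLD:
        flags.append("Skewed")
    if not col.categorical and col.n_distinct == col.n:
        flags.append("Unique")
    if col.n_zeros / col.n > ZEROS_FRACTION:
        flags.append("Zeros")
    return flags


# Summaries copied from the variable sections of this report.
columns = [
    ColumnSummary("id", 20000, 20000),
    ColumnSummary("last_review", 20000, 1507, categorical=True, n_missing=4123),
    ColumnSummary("minimum_nights", 20000, 75, skewness=25.17996962),
    ColumnSummary("number_of_reviews", 20000, 323, n_zeros=4123, skewness=3.761375507),
    ColumnSummary("price", 20000, 544, n_zeros=5, skewness=18.3046896),
]
for col in columns:
    print(col.name, warnings_for(col))
```

With these assumed thresholds, price (skewness 18.3, five zeros) stays unflagged, which matches its absence from the warning list.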

Reproduction

Analysis started: 2023-07-22 13:44:00.592058
Analysis finished: 2023-07-22 13:44:31.726211
Duration: 31.13 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct: 20000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18923800.78
Minimum: 2539
Maximum: 36485609
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 2539
5-th percentile: 1193873.85
Q1: 9393540.5
Median: 19521168.5
Q3: 29129358.75
95-th percentile: 35275607.45
Maximum: 36485609
Range: 36483070
Interquartile range (IQR): 19735818.25

Descriptive statistics

Standard deviation: 11012232.42
Coefficient of variation (CV): 0.5819249812
Kurtosis: -1.233322955
Mean: 18923800.78
Median Absolute Deviation (MAD): 9896304
Skewness: -0.07538052651
Sum: 3.784760157 × 10^11
Variance: 1.212692628 × 10^14
Monotonicity: Not monotonic
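
As a rough illustration of how several of the descriptive statistics relate to one another, the sketch below computes them with the Python standard library. The helper name `describe` is hypothetical, and the skewness uses plain population moments, so it may differ slightly from the bias-corrected estimator a profiling tool might use.

```python
import statistics


def describe(values):
    """Return a few of the descriptive statistics listed in the report."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    # Population central moments; a bias-corrected estimator would differ slightly.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,            # coefficient of variation
        "mad": mad,                  # median absolute deviation
        "skewness": m3 / m2 ** 1.5,
    }


stats = describe([1, 2, 3, 4, 10])
print(stats)

# Consistency check on the figures reported for id: CV = standard deviation / mean.
reported_cv = 11012232.42 / 18923800.78
print(round(reported_cv, 6))
```

The last two lines reproduce the reported coefficient of variation for id from its standard deviation and mean.
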
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
32493112: 1 (< 0.1%)
2728293: 1 (< 0.1%)
30741302: 1 (< 0.1%)
5354796: 1 (< 0.1%)
36308266: 1 (< 0.1%)
31194309: 1 (< 0.1%)
15509545: 1 (< 0.1%)
13073701: 1 (< 0.1%)
20775370: 1 (< 0.1%)
169152: 1 (< 0.1%)
Other values (19990): 19990 (> 99.9%)

Minimum 5 values
Value: Count (Frequency %)
2539: 1 (< 0.1%)
3831: 1 (< 0.1%)
5022: 1 (< 0.1%)
5121: 1 (< 0.1%)
5203: 1 (< 0.1%)

Maximum 5 values
Value: Count (Frequency %)
36485609: 1 (< 0.1%)
36485057: 1 (< 0.1%)
36480292: 1 (< 0.1%)
36479723: 1 (< 0.1%)
36478343: 1 (< 0.1%)
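
The histogram for id uses fixed-size bins (bins=50). A minimal sketch of that kind of binning, using only the standard library, might look like the following; `fixed_width_histogram` is a hypothetical helper, not the routine the report itself used.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum value falls on the right edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = fixed_width_histogram(list(range(100)), bins=10)
print(counts)
```

Every value lands in exactly one bin, so the counts always sum to the number of observations.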

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 19768
Distinct (%): 98.9%
Missing: 7
Missing (%): < 0.1%
Memory size: 156.4 KiB
Brooklyn Apartment: 7
Hillside Hotel: 7
Home away from home: 6
New york Multi-unit building: 6
Private Room: 6
Other values (19763): 19961

Length

Max length: 179
Median length: 36
Mean length: 36.90246586
Min length: 1

Characters and Unicode

Total characters: 737791
Distinct characters: 511
Distinct categories: 20
Distinct scripts: 11
Distinct blocks: 17
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
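
A minimal sketch of this kind of analysis, using Python's standard `unicodedata` module, is shown below. The helper `category_counts` is a hypothetical name, and the module returns two-letter general-category codes (for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator) rather than the long names printed in this report.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["Cozy Room 1!"])
for code, count in counts.most_common():
    print(code, count)
```

Applied to a whole text column, the same counter yields the per-category totals from which "Most occurring categories" style tables can be built.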

Unique

Unique: 19599
Unique (%): 98.0%

Sample

1st row: Private Lg Room 15 min to Manhattan
2nd row: TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC
3rd row: Voted #1 Location Quintessential 1BR W Village Apt
4th row: Spacious 1 bedroom apartment 15min from Manhattan
5th row: Big beautiful bedroom in huge Bushwick apartment
Common values
Value: Count (Frequency %)
Brooklyn Apartment: 7 (< 0.1%)
Hillside Hotel: 7 (< 0.1%)
Home away from home: 6 (< 0.1%)
New york Multi-unit building: 6 (< 0.1%)
Private Room: 6 (< 0.1%)
Private room in Manhattan: 5 (< 0.1%)
Cozy Room: 5 (< 0.1%)
Cozy Private Room: 4 (< 0.1%)
Private room in Williamsburg: 4 (< 0.1%)
❤ of Manhattan | Fantastic 1 Bedroom: 3 (< 0.1%)
Other values (19758): 19940 (99.7%)
(Missing): 7 (< 0.1%)
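
A frequency table of this shape (top values, an "Other values" row, and a "(Missing)" row) can be sketched with `collections.Counter`. The function name `frequency_table` and the use of `None` to mark missing entries are assumptions made for illustration.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Return (label, count, percent) rows: top values, the rest, and missing."""
    total = len(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)

    rows = [(v, c, 100 * c / total) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)

    n_other = len(counts) - len(rows)  # distinct values not listed individually
    if n_other > 0:
        other = len(present) - shown
        rows.append((f"Other values ({n_other})", other, 100 * other / total))

    n_missing = total - len(present)
    if n_missing > 0:
        rows.append(("(Missing)", n_missing, 100 * n_missing / total))
    return rows


rows = frequency_table(["a", "a", "a", "b", "b", "c", None], top=2)
for label, count, pct in rows:
    print(f"{label}: {count} ({pct:.1f}%)")
```

Percentages are taken over all rows, including missing ones, which is why the rows of the name table above sum to the full 20000 observations.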
Histogram of lengths of the category
ValueCountFrequency (%)
in6813
 
5.6%
room4038
 
3.3%
3388
 
2.8%
bedroom3145
 
2.6%
private2935
 
2.4%
apartment2748
 
2.2%
cozy2034
 
1.7%
apt1843
 
1.5%
brooklyn1666
 
1.4%
the1620
 
1.3%
Other values (7025)91941
75.3%

Most occurring characters

ValueCountFrequency (%)
102878
 
13.9%
e50869
 
6.9%
o50103
 
6.8%
t42994
 
5.8%
a42392
 
5.7%
r40165
 
5.4%
i38708
 
5.2%
n38494
 
5.2%
l21159
 
2.9%
m20160
 
2.7%
Other values (501)289869
39.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter493004
66.8%
Uppercase Letter110782
 
15.0%
Space Separator102880
 
13.9%
Other Punctuation13747
 
1.9%
Decimal Number10518
 
1.4%
Dash Punctuation2806
 
0.4%
Math Symbol1072
 
0.1%
Other Letter987
 
0.1%
Close Punctuation647
 
0.1%
Open Punctuation593
 
0.1%
Other values (10)755
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
37
 
3.7%
22
 
2.2%
21
 
2.1%
20
 
2.0%
19
 
1.9%
19
 
1.9%
16
 
1.6%
15
 
1.5%
15
 
1.5%
13
 
1.3%
Other values (301)790
80.0%
ValueCountFrequency (%)
e50869
 
10.3%
o50103
 
10.2%
t42994
 
8.7%
a42392
 
8.6%
r40165
 
8.1%
i38708
 
7.9%
n38494
 
7.8%
l21159
 
4.3%
m20160
 
4.1%
s19736
 
4.0%
Other values (48)128224
26.0%
ValueCountFrequency (%)
129
32.9%
70
17.9%
31
 
7.9%
19
 
4.8%
17
 
4.3%
14
 
3.6%
13
 
3.3%
8
 
2.0%
7
 
1.8%
6
 
1.5%
Other values (32)78
19.9%
ValueCountFrequency (%)
B12285
 
11.1%
S10802
 
9.8%
C8612
 
7.8%
A7957
 
7.2%
R7399
 
6.7%
P6047
 
5.5%
E5906
 
5.3%
L5757
 
5.2%
M4841
 
4.4%
T4763
 
4.3%
Other values (23)36413
32.9%
ValueCountFrequency (%)
,3685
26.8%
!3209
23.3%
/2153
15.7%
.1778
12.9%
&1292
 
9.4%
'441
 
3.2%
*402
 
2.9%
:239
 
1.7%
#232
 
1.7%
"113
 
0.8%
Other values (9)203
 
1.5%
ValueCountFrequency (%)
13559
33.8%
22852
27.1%
31087
 
10.3%
5916
 
8.7%
0841
 
8.0%
4562
 
5.3%
6253
 
2.4%
8167
 
1.6%
7164
 
1.6%
9117
 
1.1%
ValueCountFrequency (%)
+529
49.3%
|397
37.0%
~108
 
10.1%
=12
 
1.1%
>10
 
0.9%
<7
 
0.7%
6
 
0.6%
2
 
0.2%
×1
 
0.1%
ValueCountFrequency (%)
(568
95.8%
[16
 
2.7%
{4
 
0.7%
3
 
0.5%
2
 
0.3%
ValueCountFrequency (%)
)621
96.0%
]17
 
2.6%
}4
 
0.6%
3
 
0.5%
2
 
0.3%
ValueCountFrequency (%)
-2771
98.8%
19
 
0.7%
16
 
0.6%
ValueCountFrequency (%)
102878
> 99.9%
 2
 
< 0.1%
ValueCountFrequency (%)
95
87.2%
14
 
12.8%
ValueCountFrequency (%)
72
88.9%
9
 
11.1%
ValueCountFrequency (%)
15
78.9%
4
 
21.1%
ValueCountFrequency (%)
7
70.0%
3
30.0%
ValueCountFrequency (%)
^6
85.7%
`1
 
14.3%
ValueCountFrequency (%)
74
100.0%
ValueCountFrequency (%)
$36
100.0%
ValueCountFrequency (%)
_20
100.0%
ValueCountFrequency (%)
²7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin603703
81.8%
Common132937
 
18.0%
Han889
 
0.1%
Inherited81
 
< 0.1%
Cyrillic71
 
< 0.1%
Katakana45
 
< 0.1%
Hebrew31
 
< 0.1%
Georgian12
 
< 0.1%
Hiragana12
 
< 0.1%
Hangul8
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
37
 
4.2%
22
 
2.5%
21
 
2.4%
20
 
2.2%
19
 
2.1%
19
 
2.1%
16
 
1.8%
15
 
1.7%
15
 
1.7%
13
 
1.5%
Other values (246)692
77.8%
ValueCountFrequency (%)
102878
77.4%
,3685
 
2.8%
13559
 
2.7%
!3209
 
2.4%
22852
 
2.1%
-2771
 
2.1%
/2153
 
1.6%
.1778
 
1.3%
&1292
 
1.0%
31087
 
0.8%
Other values (97)7673
 
5.8%
ValueCountFrequency (%)
e50869
 
8.4%
o50103
 
8.3%
t42994
 
7.1%
a42392
 
7.0%
r40165
 
6.7%
i38708
 
6.4%
n38494
 
6.4%
l21159
 
3.5%
m20160
 
3.3%
s19736
 
3.3%
Other values (61)238923
39.6%
ValueCountFrequency (%)
7
15.6%
5
 
11.1%
3
 
6.7%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
2
 
4.4%
1
 
2.2%
1
 
2.2%
Other values (15)15
33.3%
ValueCountFrequency (%)
а10
14.1%
н7
9.9%
т7
9.9%
о6
 
8.5%
с6
 
8.5%
к5
 
7.0%
м4
 
5.6%
е4
 
5.6%
я3
 
4.2%
р3
 
4.2%
Other values (9)16
22.5%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
ValueCountFrequency (%)
72
88.9%
9
 
11.1%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
12
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII735931
99.7%
CJK889
 
0.1%
Misc Symbols220
 
< 0.1%
Punctuation200
 
< 0.1%
None182
 
< 0.1%
Dingbats143
 
< 0.1%
VS81
 
< 0.1%
Cyrillic71
 
< 0.1%
Hebrew31
 
< 0.1%
Georgian12
 
< 0.1%
Other values (7)31
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
102878
 
14.0%
e50869
 
6.9%
o50103
 
6.8%
t42994
 
5.8%
a42392
 
5.8%
r40165
 
5.5%
i38708
 
5.3%
n38494
 
5.2%
l21159
 
2.9%
m20160
 
2.7%
Other values (86)288009
39.1%
ValueCountFrequency (%)
70
49.0%
17
 
11.9%
13
 
9.1%
8
 
5.6%
7
 
4.9%
5
 
3.5%
4
 
2.8%
4
 
2.8%
3
 
2.1%
3
 
2.1%
Other values (5)9
 
6.3%
ValueCountFrequency (%)
37
 
4.2%
22
 
2.5%
21
 
2.4%
20
 
2.2%
19
 
2.1%
19
 
2.1%
16
 
1.8%
15
 
1.7%
15
 
1.7%
13
 
1.5%
Other values (246)692
77.8%
ValueCountFrequency (%)
95
47.5%
37
 
18.5%
19
 
9.5%
16
 
8.0%
15
 
7.5%
14
 
7.0%
4
 
2.0%
ValueCountFrequency (%)
129
58.6%
31
 
14.1%
14
 
6.4%
6
 
2.7%
6
 
2.7%
6
 
2.7%
5
 
2.3%
3
 
1.4%
3
 
1.4%
3
 
1.4%
Other values (8)14
 
6.4%
ValueCountFrequency (%)
72
88.9%
9
 
11.1%
ValueCountFrequency (%)
19
 
10.4%
à15
 
8.2%
ó9
 
4.9%
7
 
3.8%
·7
 
3.8%
²7
 
3.8%
é7
 
3.8%
7
 
3.8%
7
 
3.8%
6
 
3.3%
Other values (51)91
50.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
1
100.0%
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
12
100.0%
ValueCountFrequency (%)
а10
14.1%
н7
9.9%
т7
9.9%
о6
 
8.5%
с6
 
8.5%
к5
 
7.0%
м4
 
5.6%
е4
 
5.6%
я3
 
4.2%
р3
 
4.2%
Other values (9)16
22.5%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

host_id
Real number (ℝ≥0)

Distinct: 17027
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67460344.07
Minimum: 2571
Maximum: 274273284
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 2571
5-th percentile: 779450
Q1: 7853718.25
Median: 31114309.5
Q3: 106842560
95-th percentile: 242095840.9
Maximum: 274273284
Range: 274270713
Interquartile range (IQR): 98988841.75

Descriptive statistics

Standard deviation: 78579364.8
Coefficient of variation (CV): 1.164823066
Kurtosis: 0.2087986308
Mean: 67460344.07
Median Absolute Deviation (MAD): 27863859.5
Skewness: 1.219649017
Sum: 1.349206881 × 10^12
Variance: 6.174716572 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
219517861: 131 (0.7%)
107434423: 89 (0.4%)
30283594: 53 (0.3%)
12243051: 37 (0.2%)
16098958: 36 (0.2%)
137358866: 36 (0.2%)
61391963: 34 (0.2%)
22541573: 32 (0.2%)
200380610: 27 (0.1%)
2856748: 24 (0.1%)
Other values (17017): 19501 (97.5%)

Minimum 5 values
Value: Count (Frequency %)
2571: 1 (< 0.1%)
2787: 3 (< 0.1%)
3151: 1 (< 0.1%)
3415: 1 (< 0.1%)
3563: 1 (< 0.1%)

Maximum 5 values
Value: Count (Frequency %)
274273284: 1 (< 0.1%)
274195458: 1 (< 0.1%)
274103383: 1 (< 0.1%)
274079964: 1 (< 0.1%)
273870123: 1 (< 0.1%)

host_name
Categorical

HIGH CARDINALITY

Distinct: 6517
Distinct (%): 32.6%
Missing: 8
Missing (%): < 0.1%
Memory size: 156.4 KiB
David: 170
Michael: 167
John: 133
Sonder (NYC): 131
Alex: 105
Other values (6512): 19286

Length

Max length: 35
Median length: 6
Mean length: 6.112444978
Min length: 1

Characters and Unicode

Total characters: 122200
Distinct characters: 140
Distinct categories: 11
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4203
Unique (%): 21.0%

Sample

1st row: Iris
2nd row: Johlex
3rd row: John
4th row: Regan
5th row: Megan
Common values
Value: Count (Frequency %)
David: 170 (0.9%)
Michael: 167 (0.8%)
John: 133 (0.7%)
Sonder (NYC): 131 (0.7%)
Alex: 105 (0.5%)
Daniel: 95 (0.5%)
Blueground: 89 (0.4%)
Maria: 87 (0.4%)
Sarah: 87 (0.4%)
Chris: 83 (0.4%)
Other values (6507): 18845 (94.2%)
Histogram of lengths of the category
ValueCountFrequency (%)
428
 
1.9%
and250
 
1.1%
david185
 
0.8%
michael183
 
0.8%
sonder168
 
0.8%
john148
 
0.7%
nyc137
 
0.6%
alex129
 
0.6%
laura121
 
0.5%
maria109
 
0.5%
Other values (6048)20426
91.7%

Most occurring characters

ValueCountFrequency (%)
a15469
 
12.7%
e11787
 
9.6%
i9892
 
8.1%
n9892
 
8.1%
r7360
 
6.0%
l6239
 
5.1%
o5194
 
4.3%
t3817
 
3.1%
s3737
 
3.1%
h3686
 
3.0%
Other values (130)45127
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter96373
78.9%
Uppercase Letter22409
 
18.3%
Space Separator2334
 
1.9%
Other Punctuation616
 
0.5%
Open Punctuation149
 
0.1%
Close Punctuation149
 
0.1%
Dash Punctuation77
 
0.1%
Other Letter41
 
< 0.1%
Decimal Number36
 
< 0.1%
Math Symbol15
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
a15469
16.1%
e11787
12.2%
i9892
10.3%
n9892
10.3%
r7360
 
7.6%
l6239
 
6.5%
o5194
 
5.4%
t3817
 
4.0%
s3737
 
3.9%
h3686
 
3.8%
Other values (46)19300
20.0%
ValueCountFrequency (%)
A2627
11.7%
J2215
 
9.9%
M2152
 
9.6%
S1914
 
8.5%
C1517
 
6.8%
L1177
 
5.3%
D1132
 
5.1%
R1085
 
4.8%
K1070
 
4.8%
E970
 
4.3%
Other values (22)6550
29.2%
ValueCountFrequency (%)
3
 
7.3%
3
 
7.3%
3
 
7.3%
3
 
7.3%
2
 
4.9%
2
 
4.9%
2
 
4.9%
2
 
4.9%
1
 
2.4%
1
 
2.4%
Other values (19)19
46.3%
ValueCountFrequency (%)
&445
72.2%
.129
 
20.9%
/14
 
2.3%
,13
 
2.1%
'7
 
1.1%
!4
 
0.6%
@3
 
0.5%
:1
 
0.2%
ValueCountFrequency (%)
511
30.6%
06
16.7%
76
16.7%
14
 
11.1%
23
 
8.3%
43
 
8.3%
62
 
5.6%
31
 
2.8%
ValueCountFrequency (%)
2330
99.8%
4
 
0.2%
ValueCountFrequency (%)
(149
100.0%
ValueCountFrequency (%)
)149
100.0%
ValueCountFrequency (%)
-77
100.0%
ValueCountFrequency (%)
+15
100.0%
ValueCountFrequency (%)
£1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin118759
97.2%
Common3377
 
2.8%
Han36
 
< 0.1%
Cyrillic23
 
< 0.1%
Hiragana3
 
< 0.1%
Hangul2
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
a15469
 
13.0%
e11787
 
9.9%
i9892
 
8.3%
n9892
 
8.3%
r7360
 
6.2%
l6239
 
5.3%
o5194
 
4.4%
t3817
 
3.2%
s3737
 
3.1%
h3686
 
3.1%
Other values (63)41686
35.1%
ValueCountFrequency (%)
3
 
8.3%
3
 
8.3%
3
 
8.3%
3
 
8.3%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
Other values (14)14
38.9%
ValueCountFrequency (%)
2330
69.0%
&445
 
13.2%
(149
 
4.4%
)149
 
4.4%
.129
 
3.8%
-77
 
2.3%
+15
 
0.4%
/14
 
0.4%
,13
 
0.4%
511
 
0.3%
Other values (13)45
 
1.3%
ValueCountFrequency (%)
А3
13.0%
е3
13.0%
н2
 
8.7%
й2
 
8.7%
и2
 
8.7%
л2
 
8.7%
д1
 
4.3%
р1
 
4.3%
т1
 
4.3%
а1
 
4.3%
Other values (5)5
21.7%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII122025
99.9%
None107
 
0.1%
CJK36
 
< 0.1%
Cyrillic23
 
< 0.1%
Punctuation4
 
< 0.1%
Hiragana3
 
< 0.1%
Hangul2
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
a15469
 
12.7%
e11787
 
9.7%
i9892
 
8.1%
n9892
 
8.1%
r7360
 
6.0%
l6239
 
5.1%
o5194
 
4.3%
t3817
 
3.1%
s3737
 
3.1%
h3686
 
3.0%
Other values (63)44952
36.8%
ValueCountFrequency (%)
é43
40.2%
í10
 
9.3%
á10
 
9.3%
ë8
 
7.5%
ô6
 
5.6%
è6
 
5.6%
ú5
 
4.7%
ó2
 
1.9%
ç2
 
1.9%
ı2
 
1.9%
Other values (12)13
 
12.1%
ValueCountFrequency (%)
3
 
8.3%
3
 
8.3%
3
 
8.3%
3
 
8.3%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
Other values (14)14
38.9%
ValueCountFrequency (%)
4
100.0%
ValueCountFrequency (%)
А3
13.0%
е3
13.0%
н2
 
8.7%
й2
 
8.7%
и2
 
8.7%
л2
 
8.7%
д1
 
4.3%
р1
 
4.3%
т1
 
4.3%
а1
 
4.3%
Other values (5)5
21.7%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
ValueCountFrequency (%)
1
50.0%
1
50.0%

neighbourhood_group
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.4 KiB
Manhattan: 8774
Brooklyn: 8265
Queens: 2355
Bronx: 441
Staten Island: 165

Length

Max length: 13
Median length: 8
Mean length: 8.1783
Min length: 5

Characters and Unicode

Total characters: 163566
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Queens
2nd row: Manhattan
3rd row: Manhattan
4th row: Queens
5th row: Brooklyn
Common values
Value: Count (Frequency %)
Manhattan: 8774 (43.9%)
Brooklyn: 8265 (41.3%)
Queens: 2355 (11.8%)
Bronx: 441 (2.2%)
Staten Island: 165 (0.8%)
Histogram of lengths of the category
ValueCountFrequency (%)
manhattan8774
43.5%
brooklyn8265
41.0%
queens2355
 
11.7%
bronx441
 
2.2%
staten165
 
0.8%
island165
 
0.8%

Most occurring characters

ValueCountFrequency (%)
n28939
17.7%
a26652
16.3%
t17878
10.9%
o16971
10.4%
M8774
 
5.4%
h8774
 
5.4%
B8706
 
5.3%
r8706
 
5.3%
l8430
 
5.2%
k8265
 
5.1%
Other values (10)21471
13.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter143236
87.6%
Uppercase Letter20165
 
12.3%
Space Separator165
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
n28939
20.2%
a26652
18.6%
t17878
12.5%
o16971
11.8%
h8774
 
6.1%
r8706
 
6.1%
l8430
 
5.9%
k8265
 
5.8%
y8265
 
5.8%
e4875
 
3.4%
Other values (4)5481
 
3.8%
ValueCountFrequency (%)
M8774
43.5%
B8706
43.2%
Q2355
 
11.7%
S165
 
0.8%
I165
 
0.8%
ValueCountFrequency (%)
165
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin163401
99.9%
Common165
 
0.1%

Most frequent character per script

ValueCountFrequency (%)
n28939
17.7%
a26652
16.3%
t17878
10.9%
o16971
10.4%
M8774
 
5.4%
h8774
 
5.4%
B8706
 
5.3%
r8706
 
5.3%
l8430
 
5.2%
k8265
 
5.1%
Other values (9)21306
13.0%
ValueCountFrequency (%)
165
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII163566
100.0%

Most frequent character per block

ValueCountFrequency (%)
n28939
17.7%
a26652
16.3%
t17878
10.9%
o16971
10.4%
M8774
 
5.4%
h8774
 
5.4%
B8706
 
5.3%
r8706
 
5.3%
l8430
 
5.2%
k8265
 
5.1%
Other values (10)21471
13.1%

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 217
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.4 KiB
Williamsburg: 1580
Bedford-Stuyvesant: 1503
Harlem: 1116
Bushwick: 987
Upper West Side: 798
Other values (212): 14016

Length

Max length: 26
Median length: 12
Mean length: 11.8819
Min length: 4

Characters and Unicode

Total characters: 237638
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 0.1%

Sample

1st row: Sunnyside
2nd row: Hell's Kitchen
3rd row: West Village
4th row: Astoria
5th row: Bushwick
Common values
Value: Count (Frequency %)
Williamsburg: 1580 (7.9%)
Bedford-Stuyvesant: 1503 (7.5%)
Harlem: 1116 (5.6%)
Bushwick: 987 (4.9%)
Upper West Side: 798 (4.0%)
Hell's Kitchen: 771 (3.9%)
East Village: 756 (3.8%)
Upper East Side: 706 (3.5%)
Crown Heights: 648 (3.2%)
Midtown: 643 (3.2%)
Other values (207): 10492 (52.5%)
Histogram of lengths of the category
ValueCountFrequency (%)
east2690
 
8.3%
side1872
 
5.8%
williamsburg1580
 
4.9%
harlem1579
 
4.9%
upper1504
 
4.7%
bedford-stuyvesant1503
 
4.6%
heights1462
 
4.5%
village1301
 
4.0%
west1142
 
3.5%
bushwick987
 
3.1%
Other values (229)16713
51.7%

Most occurring characters

ValueCountFrequency (%)
e21824
 
9.2%
i17188
 
7.2%
s16285
 
6.9%
t15839
 
6.7%
a15504
 
6.5%
l14091
 
5.9%
r13837
 
5.8%
12333
 
5.2%
n10687
 
4.5%
o9832
 
4.1%
Other values (44)90218
38.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter188434
79.3%
Uppercase Letter34309
 
14.4%
Space Separator12333
 
5.2%
Dash Punctuation1736
 
0.7%
Other Punctuation826
 
0.3%

Most frequent character per category

ValueCountFrequency (%)
e21824
11.6%
i17188
 
9.1%
s16285
 
8.6%
t15839
 
8.4%
a15504
 
8.2%
l14091
 
7.5%
r13837
 
7.3%
n10687
 
5.7%
o9832
 
5.2%
d7969
 
4.2%
Other values (15)45378
24.1%
ValueCountFrequency (%)
H4904
14.3%
S4653
13.6%
B3373
9.8%
W3327
9.7%
E2888
8.4%
C2238
 
6.5%
U1535
 
4.5%
G1532
 
4.5%
F1380
 
4.0%
V1324
 
3.9%
Other values (14)7155
20.9%
ValueCountFrequency (%)
'776
93.9%
.49
 
5.9%
,1
 
0.1%
ValueCountFrequency (%)
12333
100.0%
ValueCountFrequency (%)
-1736
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin222743
93.7%
Common14895
 
6.3%

Most frequent character per script

ValueCountFrequency (%)
e21824
 
9.8%
i17188
 
7.7%
s16285
 
7.3%
t15839
 
7.1%
a15504
 
7.0%
l14091
 
6.3%
r13837
 
6.2%
n10687
 
4.8%
o9832
 
4.4%
d7969
 
3.6%
Other values (39)79687
35.8%
ValueCountFrequency (%)
12333
82.8%
-1736
 
11.7%
'776
 
5.2%
.49
 
0.3%
,1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII237638
100.0%

Most frequent character per block

ValueCountFrequency (%)
e21824
 
9.2%
i17188
 
7.2%
s16285
 
6.9%
t15839
 
6.7%
a15504
 
6.5%
l14091
 
5.9%
r13837
 
5.8%
12333
 
5.2%
n10687
 
4.5%
o9832
 
4.1%
Other values (44)90218
38.0%

latitude
Real number (ℝ≥0)

Distinct: 12439
Distinct (%): 62.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72845515
Minimum: 40.50873
Maximum: 40.91306
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 40.50873
5-th percentile: 40.6451295
Q1: 40.68942
Median: 40.72273
Q3: 40.76299
95-th percentile: 40.8256525
Maximum: 40.91306
Range: 0.40433
Interquartile range (IQR): 0.07357

Descriptive statistics

Standard deviation: 0.05475507699
Coefficient of variation (CV): 0.001344393663
Kurtosis: 0.1085655141
Mean: 40.72845515
Median Absolute Deviation (MAD): 0.03658
Skewness: 0.2301693808
Sum: 814569.103
Variance: 0.002998118457
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
40.68634: 8 (< 0.1%)
40.72232: 8 (< 0.1%)
40.71813: 8 (< 0.1%)
40.69414: 8 (< 0.1%)
40.72607: 7 (< 0.1%)
40.68683: 7 (< 0.1%)
40.70587: 7 (< 0.1%)
40.71801: 7 (< 0.1%)
40.68084: 7 (< 0.1%)
40.71992: 6 (< 0.1%)
Other values (12429): 19927 (99.6%)

Minimum 5 values
Value: Count (Frequency %)
40.50873: 1 (< 0.1%)
40.52293: 1 (< 0.1%)
40.53076: 1 (< 0.1%)
40.53871: 1 (< 0.1%)
40.53884: 1 (< 0.1%)

Maximum 5 values
Value: Count (Frequency %)
40.91306: 1 (< 0.1%)
40.90527: 1 (< 0.1%)
40.90391: 1 (< 0.1%)
40.90356: 1 (< 0.1%)
40.90329: 1 (< 0.1%)

longitude
Real number (ℝ)

Distinct: 10181
Distinct (%): 50.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.95212502
Minimum: -74.23914
Maximum: -73.71795
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: -74.23914
5-th percentile: -74.0041805
Q1: -73.98303
Median: -73.95564
Q3: -73.93638
95-th percentile: -73.864896
Maximum: -73.71795
Range: 0.52119
Interquartile range (IQR): 0.04665

Descriptive statistics

Standard deviation: 0.04655878323
Coefficient of variation (CV): -0.000629580059
Kurtosis: 4.938242884
Mean: -73.95212502
Median Absolute Deviation (MAD): 0.024895
Skewness: 1.255100378
Sum: -1479042.5
Variance: 0.002167720296
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
-73.98589: 10 (0.1%)
-73.94829: 9 (< 0.1%)
-73.95742: 9 (< 0.1%)
-73.95427: 9 (< 0.1%)
-73.95121: 9 (< 0.1%)
-73.98043: 9 (< 0.1%)
-73.95725: 9 (< 0.1%)
-73.95332: 8 (< 0.1%)
-73.95675: 8 (< 0.1%)
-73.95509: 8 (< 0.1%)
Other values (10171): 19912 (99.6%)

Minimum 5 values
Value: Count (Frequency %)
-74.23914: 1 (< 0.1%)
-74.21238: 1 (< 0.1%)
-74.20295: 1 (< 0.1%)
-74.19826: 1 (< 0.1%)
-74.19626: 1 (< 0.1%)

Maximum 5 values
Value: Count (Frequency %)
-73.71795: 1 (< 0.1%)
-73.71829: 1 (< 0.1%)
-73.72582: 1 (< 0.1%)
-73.72716: 1 (< 0.1%)
-73.72731: 1 (< 0.1%)

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.4 KiB
Entire home/apt: 10384
Private room: 9172
Shared room: 444

Length

Max length: 15
Median length: 15
Mean length: 13.5354
Min length: 11

Characters and Unicode

Total characters: 270708
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private room
2nd row: Entire home/apt
3rd row: Entire home/apt
4th row: Entire home/apt
5th row: Private room
Common values
Value: Count (Frequency %)
Entire home/apt: 10384 (51.9%)
Private room: 9172 (45.9%)
Shared room: 444 (2.2%)
Histogram of lengths of the category
ValueCountFrequency (%)
home/apt10384
26.0%
entire10384
26.0%
room9616
24.0%
private9172
22.9%
shared444
 
1.1%

Most occurring characters

ValueCountFrequency (%)
e30384
11.2%
t29940
11.1%
r29616
10.9%
o29616
10.9%
a20000
 
7.4%
20000
 
7.4%
m20000
 
7.4%
i19556
 
7.2%
h10828
 
4.0%
E10384
 
3.8%
Other values (7)50384
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter220324
81.4%
Uppercase Letter20000
 
7.4%
Space Separator20000
 
7.4%
Other Punctuation10384
 
3.8%

Most frequent character per category

ValueCountFrequency (%)
e30384
13.8%
t29940
13.6%
r29616
13.4%
o29616
13.4%
a20000
9.1%
m20000
9.1%
i19556
8.9%
h10828
 
4.9%
n10384
 
4.7%
p10384
 
4.7%
Other values (2)9616
 
4.4%
ValueCountFrequency (%)
E10384
51.9%
P9172
45.9%
S444
 
2.2%
ValueCountFrequency (%)
20000
100.0%
ValueCountFrequency (%)
/10384
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin240324
88.8%
Common30384
 
11.2%

Most frequent character per script

ValueCountFrequency (%)
e30384
12.6%
t29940
12.5%
r29616
12.3%
o29616
12.3%
a20000
8.3%
m20000
8.3%
i19556
8.1%
h10828
 
4.5%
E10384
 
4.3%
n10384
 
4.3%
Other values (5)29616
12.3%
ValueCountFrequency (%)
20000
65.8%
/10384
34.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII270708
100.0%

Most frequent character per block

ValueCountFrequency (%)
e30384
11.2%
t29940
11.1%
r29616
10.9%
o29616
10.9%
a20000
 
7.4%
20000
 
7.4%
m20000
 
7.4%
i19556
 
7.2%
h10828
 
4.0%
E10384
 
3.8%
Other values (7)50384
18.6%

price
Real number (ℝ≥0)

Distinct: 544
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.26905
Minimum: 0
Maximum: 10000
Zeros: 5
Zeros (%): < 0.1%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40
Q1: 69
Median: 105
Q3: 175
95-th percentile: 350
Maximum: 10000
Range: 10000
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 243.3256089
Coefficient of variation (CV): 1.587571717
Kurtosis: 538.297578
Mean: 153.26905
Median Absolute Deviation (MAD): 45
Skewness: 18.3046896
Sum: 3065381
Variance: 59207.35193
Monotonicity: Not monotonic
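
The quantile statistics for price can be related to each other with a short sketch. It assumes quantiles are obtained by linear interpolation between order statistics, which the report does not state explicitly; `quantile` is a hypothetical helper.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


data = sorted([5, 1, 4, 2, 3])
q1, q3 = quantile(data, 0.25), quantile(data, 0.75)
print("IQR:", q3 - q1, "range:", data[-1] - data[0])

# Consistency check on the figures reported for price.
reported_iqr = 175 - 69      # Q3 - Q1
reported_range = 10000 - 0   # maximum - minimum
print(reported_iqr, reported_range)
```

The two derived figures agree with the interquartile range and range listed for price.
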
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
100: 856 (4.3%)
150: 821 (4.1%)
50: 636 (3.2%)
200: 590 (2.9%)
75: 570 (2.9%)
60: 555 (2.8%)
80: 531 (2.7%)
70: 482 (2.4%)
120: 471 (2.4%)
65: 471 (2.4%)
Other values (534): 14017 (70.1%)

Minimum 5 values
Value: Count (Frequency %)
0: 5 (< 0.1%)
10: 6 (< 0.1%)
11: 2 (< 0.1%)
12: 1 (< 0.1%)
13: 1 (< 0.1%)

Maximum 5 values
Value: Count (Frequency %)
10000: 1 (< 0.1%)
9999: 1 (< 0.1%)
8500: 1 (< 0.1%)
7703: 1 (< 0.1%)
7500: 1 (< 0.1%)

minimum_nights
Real number (ℝ≥0)

SKEWED

Distinct: 75
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9921
Minimum: 1
Maximum: 1250
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 5
95-th percentile: 30
Maximum: 1250
Range: 1249
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 21.64544903
Coefficient of variation (CV): 3.095700724
Kurtosis: 1072.167601
Mean: 6.9921
Median Absolute Deviation (MAD): 1
Skewness: 25.17996962
Sum: 139842
Variance: 468.5254639
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
1: 5248 (26.2%)
2: 4796 (24.0%)
3: 3309 (16.5%)
30: 1540 (7.7%)
4: 1328 (6.6%)
5: 1208 (6.0%)
7: 855 (4.3%)
6: 307 (1.5%)
14: 217 (1.1%)
10: 183 (0.9%)
Other values (65): 1009 (5.0%)

Minimum 5 values
Value: Count (Frequency %)
1: 5248 (26.2%)
2: 4796 (24.0%)
3: 3309 (16.5%)
4: 1328 (6.6%)
5: 1208 (6.0%)

Maximum 5 values
Value: Count (Frequency %)
1250: 1 (< 0.1%)
999: 2 (< 0.1%)
480: 1 (< 0.1%)
400: 1 (< 0.1%)
370: 1 (< 0.1%)

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 323
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.2741
Minimum: 0
Maximum: 607
Zeros: 4123
Zeros (%): 20.6%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 5
Q3: 23
95-th percentile: 114
Maximum: 607
Range: 607
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 44.92779312
Coefficient of variation (CV): 1.930377248
Kurtosis: 20.22980666
Mean: 23.2741
Median Absolute Deviation (MAD): 5
Skewness: 3.761375507
Sum: 465482
Variance: 2018.506595
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value: Count (Frequency %)
0: 4123 (20.6%)
1: 2131 (10.7%)
2: 1394 (7.0%)
3: 1033 (5.2%)
4: 827 (4.1%)
5: 631 (3.2%)
6: 560 (2.8%)
7: 513 (2.6%)
8: 469 (2.3%)
9: 400 (2.0%)
Other values (313): 7919 (39.6%)

Minimum 5 values
Value: Count (Frequency %)
0: 4123 (20.6%)
1: 2131 (10.7%)
2: 1394 (7.0%)
3: 1033 (5.2%)
4: 827 (4.1%)

Maximum 5 values
Value: Count (Frequency %)
607: 1 (< 0.1%)
594: 1 (< 0.1%)
510: 1 (< 0.1%)
488: 1 (< 0.1%)
474: 1 (< 0.1%)

last_review
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1507
Distinct (%): 9.5%
Missing: 4123
Missing (%): 20.6%
Memory size: 156.4 KiB
2019-06-23: 575
2019-07-01: 557
2019-06-30: 553
2019-06-24: 350
2019-07-07: 292
Other values (1502): 13550

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 158770
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 358
Unique (%): 2.3%

Sample

1st row: 2019-05-26
2nd row: 2018-09-19
3rd row: 2019-05-24
4th row: 2019-06-23
5th row: 2018-08-28
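
last_review is profiled here as a categorical variable, yet every non-missing value is a 10-character string in YYYY-MM-DD form. A minimal sketch of parsing such values into dates before further analysis is shown below; the helper `parse_review_date` and the use of `None` for missing entries are assumptions for illustration.

```python
from datetime import date


def parse_review_date(text):
    """Parse a YYYY-MM-DD string into a date; pass missing values through."""
    if text is None or text == "":
        return None
    return date.fromisoformat(text)


# The five sample rows from the report, plus one missing value.
samples = ["2019-05-26", "2018-09-19", "2019-05-24", "2019-06-23", "2018-08-28", None]
parsed = [parse_review_date(s) for s in samples]

latest = max(d for d in parsed if d is not None)
print("latest sample review:", latest.isoformat())
```

Once parsed, the values can be ordered and compared as dates rather than treated as roughly 1500 unrelated categories.
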
Common values
Value: Count (Frequency %)
2019-06-23: 575 (2.9%)
2019-07-01: 557 (2.8%)
2019-06-30: 553 (2.8%)
2019-06-24: 350 (1.8%)
2019-07-07: 292 (1.5%)
2019-07-02: 280 (1.4%)
2019-06-22: 270 (1.4%)
2019-07-05: 252 (1.3%)
2019-06-16: 245 (1.2%)
2019-07-06: 240 (1.2%)
Other values (1497): 12263 (61.3%)
(Missing): 4123 (20.6%)
Histogram of lengths of the category
ValueCountFrequency (%)
2019-06-23575
 
3.6%
2019-07-01557
 
3.5%
2019-06-30553
 
3.5%
2019-06-24350
 
2.2%
2019-07-07292
 
1.8%
2019-07-02280
 
1.8%
2019-06-22270
 
1.7%
2019-07-05252
 
1.6%
2019-06-16245
 
1.5%
2019-07-06240
 
1.5%
Other values (1497)12263
77.2%

Most occurring characters

ValueCountFrequency (%)
037740
23.8%
-31754
20.0%
125367
16.0%
223933
15.1%
912270
 
7.7%
68098
 
5.1%
75309
 
3.3%
84494
 
2.8%
53914
 
2.5%
33634
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number127016
80.0%
Dash Punctuation31754
 
20.0%

Most frequent character per category

ValueCountFrequency (%)
037740
29.7%
125367
20.0%
223933
18.8%
912270
 
9.7%
68098
 
6.4%
75309
 
4.2%
84494
 
3.5%
53914
 
3.1%
33634
 
2.9%
42257
 
1.8%
ValueCountFrequency (%)
-31754
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common158770
100.0%

Most frequent character per script

ValueCountFrequency (%)
037740
23.8%
-31754
20.0%
125367
16.0%
223933
15.1%
912270
 
7.7%
68098
 
5.1%
75309
 
3.3%
84494
 
2.8%
53914
 
2.5%
33634
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII158770
100.0%

Most frequent character per block

ValueCountFrequency (%)
037740
23.8%
-31754
20.0%
125367
16.0%
223933
15.1%
912270
 
7.7%
68098
 
5.1%
75309
 
3.3%
84494
 
2.8%
53914
 
2.5%
33634
 
2.3%

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 790
Distinct (%): 5.0%
Missing: 4123
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.377445991
Minimum: 0.01
Maximum: 27.95
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.04
Q1: 0.19
median: 0.72
Q3: 2.01
95-th percentile: 4.67
Maximum: 27.95
Range: 27.94
Interquartile range (IQR): 1.82

Descriptive statistics

Standard deviation: 1.683005621
Coefficient of variation (CV): 1.22183057
Kurtosis: 11.95169963
Mean: 1.377445991
Median Absolute Deviation (MAD): 0.62
Skewness: 2.435799259
Sum: 21869.71
Variance: 2.83250792
Monotonicity: Not monotonic
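
Several of the reported statistics are simple functions of the others. The sketch below recomputes the coefficient of variation, interquartile range, range, and variance for reviews_per_month from the values listed above, using the standard definitions (CV = standard deviation / mean, IQR = Q3 - Q1, variance = standard deviation squared). The variable names are chosen here for illustration.

```python
# Values copied from the reviews_per_month summary in this report.
mean, std = 1.377445991, 1.683005621
q1, q3 = 0.19, 2.01
minimum, maximum = 0.01, 27.95

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance from the standard deviation

print(f"CV={cv:.6f} IQR={iqr:.2f} range={value_range:.2f} variance={variance:.6f}")
```

The recomputed values agree with the reported CV (1.22183057), IQR (1.82), range (27.94), and variance (2.83250792) up to rounding.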
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                     378   1.9%
0.02                  378   1.9%
0.05                  352   1.8%
0.03                  334   1.7%
0.04                  274   1.4%
0.08                  263   1.3%
0.16                  255   1.3%
0.09                  246   1.2%
0.06                  234   1.2%
0.11                  222   1.1%
Other values (780)  12941  64.7%
(Missing)            4123  20.6%

Value  Count  Frequency (%)
0.01      18  0.1%
0.02     378  1.9%
0.03     334  1.7%
0.04     274  1.4%
0.05     352  1.8%

Value  Count  Frequency (%)
27.95      1  < 0.1%
20.94      1  < 0.1%
19.75      1  < 0.1%
17.82      1  < 0.1%
16.22      1  < 0.1%

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.95545
Minimum: 1
Maximum: 327
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 14
Maximum: 327
Range: 326
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 32.43383053
Coefficient of variation (CV): 4.663081545
Kurtosis: 70.36535249
Mean: 6.95545
Median Absolute Deviation (MAD): 0
Skewness: 8.096123898
Sum: 139109
Variance: 1051.953363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value              Count  Frequency (%)
1                  13290  66.5%
2                   2688  13.4%
3                   1174   5.9%
4                    570   2.9%
5                    361   1.8%
6                    226   1.1%
8                    168   0.8%
7                    166   0.8%
327                  131   0.7%
9                    103   0.5%
Other values (37)   1123   5.6%

Value  Count  Frequency (%)
1      13290  66.5%
2       2688  13.4%
3       1174   5.9%
4        570   2.9%
5        361   1.8%

Value  Count  Frequency (%)
327      131  0.7%
232       89  0.4%
121       53  0.3%
103       36  0.2%
96        73  0.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 366
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9012
Minimum: 0
Maximum: 365
Zeros: 7176
Zeros (%): 35.9%
Memory size: 156.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 44
Q3: 229
95-th percentile: 359
Maximum: 365
Range: 365
Interquartile range (IQR): 229

Descriptive statistics

Standard deviation: 131.7622264
Coefficient of variation (CV): 1.167057803
Kurtosis: -1.007092224
Mean: 112.9012
Median Absolute Deviation (MAD): 44
Skewness: 0.75940929
Sum: 2258024
Variance: 17361.2843
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                    7176  35.9%
365                   510   2.5%
364                   225   1.1%
1                     181   0.9%
5                     136   0.7%
89                    124   0.6%
179                   121   0.6%
2                     119   0.6%
3                     118   0.6%
4                     110   0.5%
Other values (356)  11180  55.9%

Value  Count  Frequency (%)
0       7176  35.9%
1        181   0.9%
2        119   0.6%
3        118   0.6%
4        110   0.5%

Value  Count  Frequency (%)
365      510  2.5%
364      225  1.1%
363       98  0.5%
362       69  0.3%
361       48  0.2%

Interactions

[Figures: pairwise interaction plots between variables]

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
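
The definition above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions, not part of the report; the second call demonstrates the invariance under a change of location and (positive) scale.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])  # shift and rescale y
print(r, r_rescaled)
```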

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
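
A minimal sketch of this procedure: replace each value by its rank (ties receive the average rank, a common convention assumed here), then apply Pearson's formula to the ranks. The helper names and the cubic sample data are illustrative only.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation; the (n - 1) factors cancel, so they are omitted."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho, pearson_r(x, y))
```

On this monotonic but nonlinear sample, ρ reaches 1 while Pearson's r stays below 1.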

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
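
The pair-counting definition above can be sketched directly. This version implements only the simple form described in the text (often called tau-a) and applies no correction for ties; statistical libraries commonly report a tie-adjusted variant instead, so results may differ on data with ties. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Hypothetical sample data: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```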

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
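
As a rough sketch of the bias-corrected estimator, the function below computes the chi-squared statistic of the contingency table, derives φ² = χ²/n, and applies Bergsma's corrections to φ² and to the row and column counts. It is written in plain Python for illustration; the function name and sample data are assumptions, and the report's own implementation may differ in detail.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    rows, cols = Counter(x), Counter(y)

    # Chi-squared statistic of the observed contingency table.
    chi2 = 0.0
    for a, row_total in rows.items():
        for b, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical labels: perfectly associated, then independent.
v_perfect = cramers_v_corrected(["a", "b"] * 10, ["u", "v"] * 10)
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 5, ["u", "v", "u", "v"] * 5)
print(v_perfect, v_independent)
```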

Missing values

A simple visualization of nullity by column.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
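
One way to read nullity correlation is as the Pearson correlation between the "value is present" indicators of two columns. The sketch below assumes that interpretation; the function name and the small sample columns (including the `other` column) are hypothetical. In this dataset, last_review and reviews_per_month report the same number of missing values (4123), which is the situation the first call imitates.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators (1 = present, 0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / sqrt(var_a * var_b)

# Hypothetical columns: the first two are missing in exactly the same rows.
last_review = ["2019-05-26", None, "2018-09-19", "2019-05-24", None]
reviews_per_month = [0.13, None, 1.12, 0.65, None]
other = [1, 2, None, 4, 5]

same_rows = nullity_correlation(last_review, reviews_per_month)
different_rows = nullity_correlation(last_review, other)
print(same_rows, different_rows)
```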

Sample

First rows

(index) | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
09138664Private Lg Room 15 min to Manhattan47594947IrisQueensSunnyside40.74271-73.92493Private room74262019-05-260.1315
131444015TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC8523790JohlexManhattanHell's Kitchen40.76682-73.98878Entire home/apt17030NaNNaN1188
28741020Voted #1 Location Quintessential 1BR W Village Apt45854238JohnManhattanWest Village40.73631-74.00611Entire home/apt2453512018-09-191.1210
334602077Spacious 1 bedroom apartment 15min from Manhattan261055465ReganQueensAstoria40.76424-73.92351Entire home/apt125312019-05-240.65113
423203149Big beautiful bedroom in huge Bushwick apartment143460MeganBrooklynBushwick40.69839-73.92044Private room65282019-06-230.5228
54402805LRG 2br BKLYN APT CLOSE TO TRAINS AND PARK22807362JennyBrooklynProspect-Lefferts Gardens40.66025-73.96270Entire home/apt120332018-08-280.05116
630070126✩Prime Renovated 1/1 Apartment in Upper East Side✩4968673SeanManhattanUpper East Side40.76831-73.95929Entire home/apt200522019-05-260.68171
734231172Fully renovated brick house floor in Brooklyn59642348KevinBrooklynSunset Park40.64550-74.01262Entire home/apt95192019-07-089.001106
85856760Renovated 1BR in exciting, convenient area29408349ChadManhattanChinatown40.71490-73.99976Entire home/apt179572017-04-180.1410
97929441Beautiful Loft w/ Waterfront View!1453898AnthonyBrooklynWilliamsburg40.71268-73.96676Private room10522322019-06-195.00364

Last rows

(index) | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
199905192459Quiet Room in 4BR UWS Brownstone10677483GregManhattanUpper West Side40.80173-73.96625Private room7010NaNNaN10
199911327940Huge Gorgeous Park View Apartment!3290436HadarBrooklynFlatbush40.65335-73.96257Entire home/apt1203132016-08-270.282327
1999223612681Shared Room 1 Stop from Manhattan on the F Train55724558TaylorQueensLong Island City40.76006-73.94080Private room55422019-06-010.65589
1999334485745Midtown Manhattan Stunner - Private room261632622RoyaltonManhattanTheater District40.75491-73.98507Private room100132019-06-163.009318
1999425616250Stylish, spacious, private 1BR apt in Ditmas Park125396920AdamBrooklynFlatbush40.64314-73.95705Entire home/apt753102019-01-030.8410
199957094539Tranquil haven in bubbly Brooklyn2052211AdrianaBrooklynWindsor Terrace40.65360-73.97546Entire home/apt1431422016-08-270.04110
199964424261Large 1 BR with backyard on UWS3447311SarahManhattanUpper West Side40.80188-73.96808Entire home/apt2002222019-05-210.5010
199974545882Amazing studio/Loft with a backyard23569951KavehManhattanUpper East Side40.78110-73.94567Entire home/apt2203282019-05-230.501293
1999826518547U2 comfortable double bed sleeps 2 guests295128Carol GloriaBronxClason Point40.81225-73.85502Private room80142019-07-011.487365
1999933631782Private Bedroom in Williamsburg Apt!8569221AndiBrooklynWilliamsburg40.71829-73.95819Private room109332019-04-281.07297